4 research outputs found

Beyond the socket: NUMA-aware GPUs

Author: Arunkumar Akhil
Bolotin Evgeny
Ebrahimi Eiman
Jaleel Aamer
Nellans David
Ramirez Alex
Milic Ugljesa
Villa Oreste
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2017

GPUs achieve high throughput and power efficiency by employing many small single instruction multiple thread (SIMT) cores. To minimize scheduling logic and performance variance, they utilize a uniform memory system and leverage strong data parallelism exposed via the programming model. With Moore's law slowing, for GPUs to continue scaling performance (which largely depends on SIMT core count), they are likely to embrace multi-socket designs where transistors are more readily available. However, when moving to such designs, maintaining the illusion of a uniform memory system is increasingly difficult. In this work we investigate multi-socket non-uniform memory access (NUMA) GPU designs and show that significant changes are needed to both the GPU interconnect and cache architectures to achieve performance scalability. We show that application phase effects can be exploited, allowing GPU sockets to dynamically optimize their individual interconnect and cache policies, minimizing the impact of NUMA effects. Our NUMA-aware GPU outperforms a single GPU by 1.5×, 2.3×, and 3.2× while achieving 89%, 84%, and 76% of theoretical application scalability in 2-, 4-, and 8-socket designs, respectively. Implementable today, NUMA-aware multi-socket GPUs may be a promising candidate for scaling GPU performance beyond a single socket.
We would like to thank anonymous reviewers and Steve Keckler for their help in improving this paper. The first author is supported by the Ministry of Economy and Competitiveness of Spain (TIN2012-34557, TIN2015-65316-P, and BES-2013-063925).
Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Using Low Cost Erasure and Error Correction Schemes to Improve Reliability of Commodity DRAM Systems

Author: Akhil Arunkumar
Carole-Jean Wu
Chaitali Chakrabarti
David Blaauw
Hsing-Min Chen
Supreet Jeloka
Trevor Mudge
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Beyond the socket: NUMA-aware GPUs

Author: Arunkumar Akhil
Bolotin Evgeny
Ebrahimi Eiman
Jaleel Aamer
Nellans David
Ramirez Alex
Milic Ugljesa
Villa Oreste
Publication venue: 'Association for Computing Machinery (ACM)'

RECERCAT

Synthesis of 2-deoxy- d

Author: Aft
Ahangaran
Akbarzadeh
Akhil K. Dubey
Alessio
Ammar
Anand Ballal
Anselmo
Armarego
Arunkumar S. Koijam
Asadabad
Barar
Basu
Basu
Blanářová
Brown
Butler
Byrn
Caminade
Canetta
Chandan Kumar
Chen
Chen
Chin
Choi
Das
Dasari
Dhar
Fang
Ferjaoui
Florea
Gabano
Gao
Gao
Gatti
Ge
Ghosn
Gibson
Gonzalez
Guardia
Gupta
Hall
Hall
He
He
Hofmann
Housman
Huang
Huang
Hufschmid
Huseynov
Häfeli
Jain
Jiang
Johnstone
Jung
K. Shitaljit Sharma
Kandasamy
Karasawa
Karimzadeh
Kenny
Khot
Kolosnjaj-Tabi
Lee
Li
Li
Liberti
Lim
Liu
Locke
Lombardo
Ma
Maharramov
Mahmoudi
Maki
Mansoor
Mansoori
Medina
Mees
Monaco
Montagner
Montalbetti
Morel
Mosmann
Neamtu
Nemirovski
Novohradsky
Otto
Pinheiro
Prasad P. Phadnis
Pyrz
Qi
Rahimi
Rajesh K. Vatsa
Ravera
Ravera
Reddy
Rosenberg
Russell
Rybak
Sadhukha
Santos
Sastry
Sato
Sedletska
Selvan
Senapati
Shan
Sharma
Shi
Singh
Singh
Song
Spicer
Stöber
Sudip Mukherjee
Tian
Tiwari
Unsoy
Valeur
Verma
Wang
Wang
Wexselblatt
Wierzbinski
Wlassoff
Wu
Wu
Wu
Xi
Xie
Xu
Xu
Yang
Yang
Ye
Yew
Yi
Yu
Yuvakkumar
Zanellato
Zerrouki
Zhang
Zhang
Zhao
Zhao
Zheng
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2020

Crossref